Syntactic Processing of Unknown Words

Author

  • Gregor Erbach
Abstract

This paper presents a method for processing sentences that contain unknown words, i.e., words for which no lexical entry exists. Processing proceeds in three stages:

1. The sentence containing the unknown word is parsed. The parsing algorithm needs no special properties, but the lexical lookup procedure must be modified.
2. Information about the unknown word is extracted from the syntactic structure of the parse.
3. The information obtained in step 2 may be too fully specified for a lexical entry, so a filter is applied to it to create a new lexical entry.

The method is illustrated with examples from Categorial Unification Grammar, and the problem of using the extracted information for lexical knowledge acquisition is discussed.
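The three stages can be sketched in a few lines of Python. This is only a toy illustration, not the implementation described in the report: the lexicon, the `OPEN_CLASS` candidate list, the context keys recorded in stage 2, and the key whitelist used as the "filter" are all assumptions made for the example. In particular, enumerating candidate categories stands in for the unification-based treatment that a Categorial Unification Grammar would use.

```python
from itertools import product

# Categories: atomic strings ("S", "NP", "N") or tuples (result, slash, argument).
FWD, BWD = "/", "\\"

LEXICON = {  # hypothetical toy lexicon
    "the": [("NP", FWD, "N")],
    "dog": ["N"],
    "sleeps": [("S", BWD, "NP")],
}

# Assumption: an unknown word may belong to any of these open-class categories.
OPEN_CLASS = ["N", ("S", BWD, "NP"), ("N", FWD, "N")]

# Assumption: only these keys belong in a lexical entry (the stage-3 "filter").
LEXICAL_KEYS = ("word", "cat")


def lookup(word):
    """Modified lexical lookup: unknown words get every open-class candidate."""
    return LEXICON.get(word, OPEN_CLASS)


def combine(left, right):
    """Forward (X/Y Y -> X) and backward (Y X\\Y -> X) application."""
    out = []
    if isinstance(left, tuple) and left[1] == FWD and left[2] == right:
        out.append(left[0])
    if isinstance(right, tuple) and right[1] == BWD and right[2] == left:
        out.append(right[0])
    return out


def derives(cats, goal="S"):
    """CYK-style check: can the category sequence be reduced to the goal?"""
    n = len(cats)
    chart = {(i, i + 1): {c} for i, c in enumerate(cats)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            cell = set()
            for k in range(i + 1, i + width):
                for l, r in product(chart[(i, k)], chart[(k, i + width)]):
                    cell.update(combine(l, r))
            chart[(i, i + width)] = cell
    return goal in chart[(0, n)]


def extract(sentence):
    """Stages 1 and 2: parse, then record what each parse says about unknown words."""
    words = sentence.split()
    found = []
    for cats in product(*(lookup(w) for w in words)):
        if derives(list(cats)):
            for i, (w, c) in enumerate(zip(words, cats)):
                if w not in LEXICON:
                    # Deliberately over-specific: includes context of this one sentence.
                    found.append({"word": w, "cat": c, "position": i,
                                  "left": words[i - 1] if i else None})
    return found


def make_entry(info):
    """Stage 3: drop sentence-specific details, keep what a lexical entry needs."""
    return {k: info[k] for k in LEXICAL_KEYS}


if __name__ == "__main__":
    for info in extract("the wug sleeps"):
        print(make_entry(info))
```

With this toy lexicon, the only category that lets "the wug sleeps" reduce to `S` is `N`, so the sketch proposes a noun entry for "wug". The whitelist in `make_entry` is the crudest possible filter; deciding which extracted information is general enough to keep is exactly the acquisition problem the abstract raises.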

Similar resources

Analysis of Unknown Lexical Items using Morphological and Syntactic Information with the TIMIT Corpus

The importance of dealing with unknown words in Natural Language Processing (NLP) is growing as NLP systems are used in more and more applications. One aid in predicting the lexical class of words that do not appear in the lexicon (referred to as unknown words) is the use of syntactic parsing rules. The distinction between closed class and open class words together with morphological recognition appe...

Morpho-syntactic tagging system based on the patterns words for Arabic texts

Text tagging is an important tool for various applications in natural language processing, namely the morphological and syntactic analysis of texts, indexing and information retrieval, "vocalization" of Arabic texts, and probabilistic language models (n-class models). However, these systems are based on a limited set of lexemes and are consequently unable to handle unknown words. To overcome thi...

Automatic Semantic Role Labeling in Persian Sentences Using Dependency Trees

Automatic identification of words with semantic roles (such as Agent, Patient, Source, etc.) in sentences and attaching correct semantic roles to them may lead to improvement in many natural language processing tasks, including information extraction, question answering, text summarization and machine translation. Semantic role labeling systems usually take advantage of syntactic parsing and th...

Post Mortem Parsing with Unknown Lexical Items using Morphological Recognition, Syntactic Information and a Closed Class Lexicon

The importance of dealing with unknown words in natural language processing (NLP) is growing as NLP systems are used in more and more applications. The ability to parse sentences containing unknown words will make a parsing system more robust and flexible. The use of syntactic parsing rules provides constraints on the possible lexical categories of unknown words. A lexicon of closed class words also o...

Tuning an Existing Nomenclature for Specific Domain Corpora: A Syntax-Based Similarity Method

There is a constant need to extend and tune medical vocabularies to account for new words and new word usages. Robust natural language processing (NLP) tools can be applied to medical text corpora, such as patient narratives, and help collect and analyze unknown words [1,2]. The aim of the present work is to assess the potential for classifying unknown words based on the semantic categories of “nei...

Journal title:
  • IWBS Report

Volume 131

Publication date: 1990